
Poster: aibek Date: Jan 8, 2014 2:37am
Forum: texts Subject: Re: bug in text search

Another problem with the same page:

Searching for the term Präfixes, which occurs twice on the page (in both Notes and Description), sometimes returns the desired item and sometimes does not. [3] Attached is a screenshot of a search that did not. [size: 40KB]

[3] https://archive.org/search.php?query=Pra%CC%88fixes
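One possible factor, offered as an assumption rather than a confirmed cause: the query in [3] encodes the "ä" as a plain "a" followed by a combining diaeresis (%CC%88 is the UTF-8 encoding of U+0308). If the indexed text stores the precomposed character U+00E4 instead, a comparison without Unicode normalization would not match. The sketch below only decodes the query string and compares the two forms; it says nothing about how the archive.org search engine actually handles them.

```python
import unicodedata
from urllib.parse import unquote

# Query string copied from the search URL in [3].
raw_query = unquote("Pra%CC%88fixes")

# NFC composes "a" + U+0308 into the single code point U+00E4;
# NFD keeps (or produces) the decomposed form.
nfc = unicodedata.normalize("NFC", raw_query)
nfd = unicodedata.normalize("NFD", raw_query)


def code_points(text):
    """Return the code points of text as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(raw_query))
print(code_points(nfc))
print("query equals NFC form:", raw_query == nfc)
```

If the index and the query were both normalized to the same form before comparison, the two spellings would be treated as the same word.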

This post was modified by aibek on 2014-01-08 10:37:03

Attachment: Pra__fixes.png


Poster: Jeff Kaplan (Administrator, Curator, or Staff) Date: Jan 8, 2014 6:50am
Forum: texts Subject: Re: bug in text search

I will check, but I believe that a basic search includes title and description[0] and does not search all fields. It looks like that item and many of the Google items have multiple description fields, and the additional ones are not in the basic search. In this case, Präfixes is in description[1] and benutzten Werke und Ausgaben is in description[2], so they don't appear in results.

You can see this in the meta.xml at https://archive.org/download/dievorsilbeveru00leopgoog/dievorsilbeveru00leopgoog_meta.xml
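As a rough illustration of the behavior described above, here is a minimal sketch that reads every description element from a meta.xml-style document and compares a search over only the first one with a search over all of them. The sample XML is a hypothetical, shortened stand-in (the placeholder texts are not the real file contents), and the helper names `descriptions` and `matches` are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a meta.xml file with repeated description fields.
SAMPLE_META = """<metadata>
  <identifier>dievorsilbeveru00leopgoog</identifier>
  <description>placeholder for description[0]</description>
  <description>placeholder text containing Präfixes</description>
  <description>placeholder text containing benutzten Werke und Ausgaben</description>
</metadata>"""


def descriptions(meta_xml):
    """Return the text of every description element, in document order."""
    root = ET.fromstring(meta_xml)
    return [el.text or "" for el in root.findall("description")]


def matches(meta_xml, term, first_only):
    """Check whether term occurs in description[0] only, or in any description."""
    fields = descriptions(meta_xml)
    if first_only:
        fields = fields[:1]
    return any(term in field for field in fields)


print(matches(SAMPLE_META, "Präfixes", first_only=True))   # description[0] only
print(matches(SAMPLE_META, "Präfixes", first_only=False))  # all descriptions
```

Under these assumptions, an index built from description[0] alone misses terms that appear only in the later description fields.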

I will pass the information along. Thanks.

This post was modified by Jeff Kaplan on 2014-01-08 14:50:32


Poster: aibek Date: Jan 8, 2014 8:41pm
Forum: texts Subject: Re: bug in text search

Hi Jeff!

A couple of issues.

1) Notes are searched (in general). See:
https://archive.org/search.php?query=%22Extremely%20narrow%20margins%22%
As mentioned originally, Präfixes somehow does not work, perhaps because it contains a non-ASCII character? Anyway, it is a bug! †

2) Description[1] and [2] are not searched even when we specifically ask for a search in the Description (via the Advanced Search page).
http://archive.org/search.php?query=description%3A%28Pra%CC%88fixes%29

3) As no schema is supplied for the XML, it is impossible to say if the XML (meta.xml) is valid, but I think the XML philosophy is being violated by having more than one Description field! The code creating the XML ought to merge all the descriptions into one! (I can supply references to the philosophy issue if you want me to.)
The search engine code rightly assumes that only one Description field is present. Note that it correctly assumes that there may be more than one Creator field.
http://archive.org/search.php?query=creator%3A%28Edgren%29%20AND%20creator%3A(Whitney)

An unrelated issue:
4) Some of the MARC data never becomes part of the searchable metadata! In the following marc.xml file, the field 245.c has been ignored by the code creating the meta.xml file. A search for a few words from the field results in nothing.
https://ia601305.us.archive.org/31/items/gri_33125000755724/gri_33125000755724_marc.xml
https://ia601305.us.archive.org/31/items/gri_33125000755724/gri_33125000755724_meta.xml
http://archive.org/search.php?query=%28hrsg.%20von%20Konrad%20Kirch%29
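To make issue 4 concrete, here is a minimal sketch of pulling subfield c of field 245 out of a MARCXML record, which is the value the meta.xml conversion appears to skip. The sample record is a hypothetical fragment: the subfield c text is built from the words in the search query above, the title is a placeholder, and the helper name `subfields` is invented for this example. It is not the code archive.org uses.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML (MARC21 slim schema).
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical fragment; only the subfield c wording comes from the query above.
SAMPLE_MARC = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Placeholder title /</subfield>
    <subfield code="c">hrsg. von Konrad Kirch.</subfield>
  </datafield>
</record>"""


def subfields(marc_xml, tag, code):
    """Return the text of every subfield with this code in datafields with this tag."""
    root = ET.fromstring(marc_xml)
    path = f".//marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text or "" for el in root.findall(path, MARC_NS)]


# Statement of responsibility (245 subfield c) from the sample record.
print(subfields(SAMPLE_MARC, "245", "c"))
```

A converter could copy this value into a searchable metadata field; which field that should be is left open here.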

---
† As mentioned originally, I think that I once got the desired page when searching for Präfixes, but every later search has returned nothing. I will try it a few more times.

This post was modified by aibek on 2014-01-09 04:41:54